{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "877baeec-ea53-467b-8438-9900f7b66a61",
   "metadata": {},
   "source": [
    "# Lesson 1: Overview\n",
    "\n",
    "In this lesson, you will:\n",
    "1. Explore the dataset of LLM prompts and responses named **chats.csv** that we’ll use throughout this course.\n",
    "2. Get a fast demo overview of all the techniques showcased in greater detail in later lessons.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "589deee9-da2e-4e74-b193-42ace8d58f69",
   "metadata": {},
   "source": [
    "## Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fd87ead8-ef98-4243-b67d-fd736d6397b7",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "import helpers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "689dfb3b-a599-450a-b3f2-7d882da9633a",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "267bab21-796f-4fe7-bfc1-faf82bde91fd",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "chats = pd.read_csv(\"./chats.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c9455a8d-badf-4b70-a90a-6443c58d9fad",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "chats.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "efd39a5d-4950-4c1f-83d7-29057ad29cf2",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "pd.set_option('display.max_colwidth', None)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "304842fe-9483-4e96-bbd0-d956260e5824",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "chats.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d8b0808-1f09-49dc-b5ef-34a63626ba7b",
   "metadata": {},
   "source": [
    "## Setup and explore whylogs and langkit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c4f26b8-9bb5-418c-8e38-e1408aa2f80d",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "import whylogs as why"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b48e1131-2fce-4c59-a4f7-60dfe4e8d917",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "why.init(\"whylabs_anonymous\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ccc1cd0b-d77f-42fc-aeae-917397df6f91",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "from langkit import llm_metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03260f49-ea45-48a9-b90c-3019503f4a4d",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "schema = llm_metrics.init()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "261c2dc1-72dc-46f8-b1cd-3f00d7906d1f",
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "result = why.log(chats,\n",
    "                 name=\"LLM chats dataset\",\n",
    "                 schema=schema)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef256947-b3c4-4a99-96b1-fa354f6533c5",
   "metadata": {},
   "source": [
    "### Prompt-response relevance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3f28908-7a35-4ff7-8afc-8edbcd07f470",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "from langkit import input_output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1e955c3",
   "metadata": {},
   "source": [
    "**Note**: To view the next visual, you may have to either hide the left-side menu bar or widen the notebook towards the right."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a068589b-d0c9-4e00-8d8d-0b3dee42d056",
   "metadata": {
    "height": 80
   },
   "outputs": [],
   "source": [
    "helpers.visualize_langkit_metric(\n",
    "    chats,\n",
    "    \"response.relevance_to_prompt\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9790b7a6-885e-477e-b615-37176b4ce234",
   "metadata": {
    "height": 80
   },
   "outputs": [],
   "source": [
    "helpers.show_langkit_critical_queries(\n",
    "    chats,\n",
    "    \"response.relevance_to_prompt\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9072f821-e26d-43ba-88e9-59f60cfcf1df",
   "metadata": {},
   "source": [
    "### Data Leakage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc73ec06-2f40-412a-b709-f79b600cb153",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "from langkit import regexes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab8eaca3",
   "metadata": {},
   "source": [
    "**Note**: To view the next visuals, you may have to either hide the left-side menu bar or widen the notebook towards the right."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4139c04d-a2d1-4c7c-83a8-78c4206e2818",
   "metadata": {
    "height": 80
   },
   "outputs": [],
   "source": [
    "helpers.visualize_langkit_metric(\n",
    "    chats,\n",
    "    \"prompt.has_patterns\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "73d0e55c-917e-4bf7-a11c-1335949345d8",
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "helpers.visualize_langkit_metric(\n",
    "    chats, \n",
    "    \"response.has_patterns\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "afba5e12-03f7-410d-80dc-1494638bbc8b",
   "metadata": {},
   "source": [
    "### Toxicity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85289ec5-7a26-42fc-bcb5-b43b91783adf",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "from langkit import toxicity"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99bfaa88",
   "metadata": {},
   "source": [
    "**Note**: To view the next visuals, you may have to either hide the left-side menu bar or widen the notebook towards the right."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bf6c4a9a-3bc3-4a80-b0b4-6cf4043084e2",
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "helpers.visualize_langkit_metric(\n",
    "    chats, \n",
    "    \"prompt.toxicity\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06ae8e5f-68aa-4e8f-9d77-3e57cdec3054",
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "helpers.visualize_langkit_metric(\n",
    "    chats, \n",
    "    \"response.toxicity\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db6deb13-d6ef-4008-a44c-d68b938deacf",
   "metadata": {},
   "source": [
    "### Injections"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e69327e7-5357-43fa-808a-40db065a7689",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "from langkit import injections"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "880ef5d7",
   "metadata": {},
   "source": [
    "**Note**: To view the next visual, you may have to either hide the left-side menu bar or widen the notebook towards the right."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7bef8ba8-5ce3-4a9e-8db9-7807fb9f092f",
   "metadata": {
    "height": 80
   },
   "outputs": [],
   "source": [
    "helpers.visualize_langkit_metric(\n",
    "    chats,\n",
    "    \"injection\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19838ff8-668f-4765-b385-fb4a43c992e5",
   "metadata": {
    "height": 80
   },
   "outputs": [],
   "source": [
    "helpers.show_langkit_critical_queries(\n",
    "    chats,\n",
    "    \"injection\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "658165d4-5340-4c8d-b4b5-b665384029fc",
   "metadata": {},
   "source": [
    "## Evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a62ad5e-522e-4923-a12b-3f235b0119c0",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "helpers.evaluate_examples()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a29973dd-9976-4fb9-94bd-e296598b635b",
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "filtered_chats = chats[\n",
    "    chats[\"response\"].str.contains(\"Sorry\")\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e9032dd-82c1-49fa-bc3b-5bbe30bd8a73",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "filtered_chats"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d4cc77e-cec5-401e-b324-5e0174c4e400",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "helpers.evaluate_examples(filtered_chats)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03f114d0-3cd1-4b5c-a318-54214f341120",
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "filtered_chats = chats[\n",
    "    chats[\"prompt\"].str.len() > 250\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16058418-c298-44ff-a199-956c0f215f47",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "filtered_chats"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cb32f1ec-7d97-4a52-825e-6fe1972cdac4",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "helpers.evaluate_examples(filtered_chats)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9d17cc4-18e7-4c10-a4c8-ac4cce1ebc39",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3ab0ae31-beb3-4e1e-b93a-1fb0af44cbfb",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b42130d-0d2d-4c7a-9126-78ab10d4ea4e",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2a4228a8-9cc5-4dfa-9592-305e7602d6c6",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e9e0383-1e5a-46e7-b3e1-27b265057dde",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "97b40ab4-6d04-4fa8-b466-104d0fc98ab0",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79905fba-f34b-4fd5-9571-4f3b398cbb70",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da5268fa-01c5-4467-b2f2-a02c905ec115",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "82dca680-fff2-4b15-9c80-d0ac940b74e5",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b8bf47d-4139-43b8-9c7d-ee52a6f2df93",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13019db8-6b08-401b-923e-65ac320f9e63",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14f93928-1ce6-4ce5-a2b0-5dd47563efcd",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d480e7a0-b6a2-4dbe-8762-da29579a6715",
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
